Image velocity estimation

ABSTRACT

A method of image velocity estimation in image processing which uses a block matching technique in which a similarity measure is used to calculate the similarity between blocks in successive images. The similarity measure is used to calculate a probability density function of candidate velocities. The calculation is on the basis of an exponential function of the similarity in which the similarity is multiplied by a parameter whose value is independent of position in the frame. The candidate velocities are thresholded to exclude those having a low probability. The value of the parameter and threshold are optimised together by coregistering all frames to the first frame, calculating the registration error, and varying them to minimise the registration error. The similarity measure is normalised with respect to the size of the block, for example by dividing it by the number of image samples in the blocks being compared. The similarity measure used may be the CD 2-bis  similarity measure in which the mean and standard deviation of the two blocks being compared are adjusted to be the same before calculation of the similarity. This makes the similarity measure particularly suitable for ultrasound images. Further, block matching may be conducted across three frames of the sequence by comparing the intensities in blocks in the first and third, and second and third of the frames and finding the block in the third frame which best matches the block in the second frame and that block&#39;s corresponding position in the first frame.

The present invention relates to image processing, and in particular toimproving the estimation of image velocity in a series of image frames.

There are many imaging situations in which a subject in an image is inmotion and it is desired to track or measure the movement of the subjectfrom frame to frame. This movement is known as optical flow or imagevelocity. Such estimation or measurement of image velocity may be done,for example, to improve the efficiency of encoding the image, or toallow enhancement of the display of, or measurement of, the movement ofsome particular tracked part of the image to assist an observer tryingto interpret the image. Many techniques have been proposed and used forimage velocity estimation and one of the basic techniques is known asblock matching. In block matching, blocks of pixels are defined in afirst frame and the aim is then to identify the position of those blocksin a second subsequent frame. One approach is to compare the intensitiesof the pixels in the block in the first frame with successive, displacedcandidate blocks in the second frame using a similarity measure, such asthe sum of square differences. The block in the second frame which givesthe minimum of the sum of square differences (or gives the best matchwith whichever similarity measure is chosen) is taken to be the sameblock displaced by movement of the subject. Repeating the process forsuccessive blocks in the first image frame gives an estimate for thesubject motion at each position in the image (the image velocity field).

FIG. 1 schematically illustrates the idea. Two frames are shown, frame 1and frame 2. These may be, but are not necessarily, successive frames ina sequence. Frame 1 is divided up into square blocks of pixels having aside length of (2n+1) pixels, ie. from −n to +n about a central pixel(x, y) in each block. One block W_(c) is illustrated in FIG. 1. A searchwindow W_(s) is defined in the second frame around the position of thecorresponding central pixel (x, y) in the second frame. As illustratedin FIG. 1 it is a square search region of side length (2N+1) pixels. Theintensities of the block W_(c) of pixels in frame 1 are then compared atall possible positions of the block in the search window W_(s). So, forexample, the first comparison is made with the corresponding (2n+1) by(2n+1) block in the top left hand corner of the search window W_(s), andthen with such a block displaced one pixel to the right, and then ablock displaced two pixels to the right and so on until the end of thesearch window is reached. The procedure is then repeated for a row ofcandidate blocks displaced one pixel down in the search window from thefirst row, and so on until the bottom of the search window is reached.The similarity measure may, for example, be a sum of square differences:$\begin{matrix}{{E_{c}\left( {u,v} \right)} = {\sum\limits_{i = {- n}}^{n}{\sum\limits_{j = {- n}}^{n}\left\lbrack {{I\left( {{x + i},{y + j},t} \right)} - {I\left( {{x + u + i},{y + v + j},{t + 1}} \right)}} \right\rbrack^{2}}}} & (1)\end{matrix}$for each value of (u, v) for −N≦u, v≦N and where i and j index throughthe block W_(c) centered on the pixel (x , y) in the x and y directionsrespectively, and u and v are the different values of displacement whichindex over the search window W_(s). The values u and v can, given thetime difference between the frames, be regarded as a velocity. Thisgives a value of E_(c) for each estimated displacement. The estimateddisplacement with the minimum E_(c) is often taken as the actualdisplacement of the block. This is repeated for all positions in frame 1to give a velocity field, and then for all frames in the sequence.Different similarity measures may, of course, be used. Also, it is notalways necessary to conduct the block matching on all frames of thesequence, or for all pixels or blocks in each frame. The block W_(c) maysubsample the pixels in the frame and the candidate displacements u andv may be indexed by more than one pixel. Thus the searching may be atdifferent resolutions and scales. Sometimes a multi-scale and/ormulti-resolution approach may be used in which block matching is firstperformed at a coarse resolution or large scale, and subsequently atsuccessively finer resolutions, using the previously calculated velocityvalues to reduce the amount of searching required at finer resolutions.

Medical images present many difficulties in image processing because ofthe typically high level of noise found in them. For example, thetracking of cardiac walls in cardiac ultrasound images is difficultbecause of the high level of noise found in ultrasound images and alsobecause of the nature of the cardiac motion. In particular, the cardiacmotion varies during the cardiac cycle. Various ways of identifying andtracking cardiac walls in echocardiograms have been proposed in WO01/16886 and WO 02/43004, but it is a difficult task in which there isroom for improvement.

A development of the block matching technique as described above hasbeen proposed by A. Singh, “Image-flow computation: Anestimation-theoretic framework and a unified perspective,” CVGIP: Imageunderstanding, vol. 65, no. 2, pp. 152-177, 1992 which is incorporatedherein by reference. In this approach both conservation information,e.g. from a block matching technique as described above, andneighbourhood information (i.e. looking at the velocities of surroundingpixels) are combined with weights based on estimates of their associatederrors. Thus in a first step based on conservation information thesimilarity values E_(c) are used in a probability mass function tocalculate a response R_(c)whose value at each position in the searchwindow represents the likelihood of the corresponding displacement. Theprobability mass function is given by $\begin{matrix}{{{R_{c}\left( {u,v} \right)} = {\frac{1}{Z}{\exp\left( {{- k}\quad{E_{c}\left( {u,v} \right)}} \right)}}},} & (2)\end{matrix}$where Z is defined such that all probabilities sum to unity i.e.:$\begin{matrix}{{\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{R_{c}\left( {u,v} \right)}}} = 1} & (3)\end{matrix}$

In the function for the response the parameter k is chosen at eachposition such that the maximum response in the search window is close tounity (0.95 before normalisation) for computational reasons. Theexpected value of the velocity is then found by multiplying eachcandidate value by its probability and summing the results:$\begin{matrix}{u_{cc} = {\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{u\quad{R_{c}\left( {u,v} \right)}}}}} & (4) \\{v_{cc} = {\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{v\quad{R_{c}\left( {u,v} \right)}}}}} & (5)\end{matrix}$

Another velocity estimate may be obtained by the use of neighbourhoodinformation. In other words, the velocity at each pixel is unlikely tobe completely independent of the velocity of its neighbours. Thus,assuming that the velocity of each pixel in a small neighbourhood W_(p)has been estimated, the velocity estimates for each pixel can be refinedby using the velocity of its neighbouring pixels. Clearly it is morelikely that the velocities of closer neighbours are more relevant thanpixels which are further away. Therefore weights are assigned tovelocities calculated for the neighbouring pixels, and the weights dropwith increasing distance from the central pixel (a 2-D Gaussian mask inthe window W_(p) of size (2w+1)(2w+1) is used). These weights can beinterpreted as a probability mass function R_(n)=(u_(i),v_(i)) where${\sum\limits_{x_{i} \in W_{p}}{R_{n}\left( {u_{i},v_{i}} \right)}} = 1$(i is an index for pixels in W_(p)) in a uv space. Now, the velocityestimate U=({overscore (u)},{overscore (v)}) for the central pixel usingneighbourhood information can be calculated as: $\begin{matrix}{\overset{\_}{u} = {\sum\limits_{x_{i} \in W_{p}}{u_{i}{R_{n}\left( {u_{i},v_{i}} \right)}}}} & (6) \\{\overset{\_}{v} = {\sum\limits_{x_{i} \in W_{p}}{v_{i}{R_{n}\left( {u_{i},v_{i}} \right)}}}} & (7)\end{matrix}$

An important aspect of the Singh approach is that a covariance matrix iscalculated for each velocity estimate, for both the estimates based onconservation information and the estimates based on neighbourhoodinformation. These covariance matrices can be used to calculate errorswhich are used as weights when combining the two estimates together togive a fused, optimal estimate.

The covariance matrix corresponding to the estimate U_(cc) is given by:$\begin{matrix}{S_{cc} = \begin{bmatrix}{\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{\left( {u - u_{cc}} \right)^{2}{R_{c}\left( {u,v} \right)}}}} & {\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{\left( {u - u_{cc}} \right)\left( {v - v_{cc}} \right){R_{c}\left( {u,v} \right)}}}} \\{\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{\left( {u - u_{cc}} \right)\left( {v - v_{cc}} \right){R_{c}\left( {u,v} \right)}}}} & {\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{\left( {v - v_{cc}} \right)^{2}{R_{c}\left( {u,v} \right)}}}}\end{bmatrix}} & (8)\end{matrix}$

The covariance matrix corresponding to the neighbourhood estimate{overscore (U)} is as follows: $\begin{matrix}{S_{n} = \begin{bmatrix}{\sum\limits_{x_{i} \in W_{p}}{\left( {u_{i} - \overset{\_}{u}} \right)^{2}\quad{R_{n}\left( {u_{i},v_{i}} \right)}}} & {\sum\limits_{x_{i} \in W_{p}}{\left( {u_{i} - \overset{\_}{u}} \right)\left( {v_{i} - \overset{\_}{v}} \right)\quad{R_{n}\left( {u_{i},v_{i}} \right)}}} \\{\sum\limits_{x_{i} \in W_{p}}{\left( {u_{i} - \overset{\_}{u}} \right)\left( {v_{i} - \overset{\_}{v}} \right)\quad{R_{n}\left( {u_{i},v_{i}} \right)}}} & {\sum\limits_{x_{i} \in W_{p}}{\left( {v_{i} - \overset{\_}{v}} \right)^{2}\quad{R_{n}\left( {u_{i},v_{i}} \right)}}}\end{bmatrix}} & (9)\end{matrix}$

Thus these steps give two estimates of velocity, U_(cc) and {overscore(U)}, from conservation and neighbourhood information respectively, eachwith a covariance matrix representing their error. An estimate U ofvelocity that takes both conservation information and neighbourhoodinformation into account can now be computed. The distance of this newestimate from {overscore (U)}, weighted appropriately by thecorresponding covariance matrix, represents the error in satisfyingneighbourhood information. This can be termed neighbourhood error.Similarly the distance of this estimate from U_(cc), weighted by itscovariance matrix, represents the error in satisfying conservationinformation. This may be termed conservation error. The sum ofneighbourhood and conservation errors represents the squared error ofthe fused velocity estimate:ε²=(U−U _(cc))^(T) S _(cc) ⁻¹(U−U _(cc))+(U−{overscore (U)})^(T) S _(n)⁻¹(U−{overscore (U)})   (10)

The optimal value of velocity is that value which minimises this errorand can be obtained by setting the gradient of the error with respect toU equal to zero giving: $\begin{matrix}{{\hat{U}}_{op} = {\left\lbrack {S_{cc}^{- 1} + S_{n}^{- 1}} \right\rbrack^{- 1}\left\lbrack {{S_{cc}^{- 1}U_{cc}} + {S_{n}^{- 1}\overset{\_}{U}}} \right\rbrack}} & (11)\end{matrix}$

Because the values of {overscore (U)} and S_(n) are derived on theassumption that the velocity of each pixel of the neighbourhood is knownin advance, in practice equation (11) is solved in an iterative process(via Gauss-Seidel relaxation) with the initial values of the velocity ateach pixel being taken from the conservation information alone. Thus:$\begin{matrix}\left\{ \begin{matrix}{U^{0} = U_{cc}} \\{U_{op}^{m + 1} = {\left\lbrack {S_{cc}^{- 1} + S_{n}^{- 1}} \right\rbrack^{- 1}\left\lbrack {{S_{cc}^{- 1}U_{cc}} + {S_{n}^{- 1}\overset{\_}{U_{op}^{m}}}} \right\rbrack}}\end{matrix} \right. & (12)\end{matrix}$

where the superscript m refers to the iteration number. Iterationcontinues until the difference between two successive values of U_(op)is smaller than a specified value.

While this technique usefully combines conservation and neighbourhoodinformation, and does so in a probabilistic way, it does not always workwell in practice, particularly with noisy images of the type found inmedical imaging and ultrasound imaging in general.

Another common problem in image velocity estimation using matchingtechniques is known as the multi-modal response (i.e. due to thewell-known aperture problem, for example, or mismatching especially whenthe size of the search window is large). A common way to reduce theeffect of the multi-modal response is to compare the intensities inthree frames, rather than just two as explained above. So the similaritybetween blocks W_(c) in two frames x_(i) and y_(i) is found, and betweentwo blocks W_(c) in y_(i) and z_(i), as shown in FIG. 2 of the drawings.In the two frame comparison the intensities in a block W_(c) in oneframe x_(i) at time t are compared with the intensities in acorresponding block displaced by the candidate velocity (u,v) in thenext frame y_(i) at time t+1 for all values of (u, v) in the searchwindow W_(s). In the three frame approach, the intensities in the blockW_(c) are also compared with the intensities in the block displaced by(2u, 2v) in the next-but-one frame z_(i) at time t+2, again for valuesof (u, v) in the search window W_(s). In the case of using sum-of-squaredifferences as the similarity measure this can be written as:$\begin{matrix}{{E_{c}\left( {u,v} \right)} = {\sum\limits_{i = {- n}}^{n}{\sum\limits_{j = {- n}}^{n}\left( \begin{matrix}{\left\lbrack {{I\left( {{x + i},{y + j},t} \right)} - {I\left( {{x + u + i},{y + v + j},{t + 1}} \right)}} \right\rbrack^{2} +} \\\left\lbrack {{I\left( {{x + i},{y + j},t} \right)} - {i\left( {{x + {2u} + i},{y + {2v} + j},{t + 2}} \right)}} \right\rbrack^{2}\end{matrix} \right.}}} & (13)\end{matrix}$

where the first term is comparing blocks in the frames at t and t+1separated by a displacement (u, v) and the second term is comparingblocks in the frames at t and t+2 separated by twice that, i.e. (2u,2v). This amounts to assuming that the velocity is constant across threeframes of the sequence. In other words, for three frames at times t, t+1and t+2, it is assumed that the displacements between t and t+1 are thesame as the displacements between t+1 and t+2. This assumption isreasonable for high frame rate sequences, but is poor for low frame ratesequences, such as are encountered in some medical imaging techniques,including some ultrasound imaging modalities.

The present invention is concerned with improvements to block matchingwhich are particularly effective for medical images, especiallyultrasound images, which are inherently noisy.

A first aspect of the invention provides a method of processing asequence of image frames to estimate image velocity through the sequencecomprising:

block matching using a similarity measure by comparing the intensitiesin image blocks in two frames of the sequence and calculating thesimilarity between the said blocks on the basis of their intensities,calculating from the similarity a probability measure that the twocompared blocks are the same, and estimating the image velocity based onthe probability measure, wherein the probability measure is calculatedusing a parametric function of the similarity which is independent ofposition in the image frames.

Preferably the parameters of the parametric function are independent ofposition in the image frames. The function may be a monotonic, e.g.exponential, function of the similarity, in which the similarity ismultiplied by a positionally invariant parameter. The parameters may beoptimised by coregistering the frames in the sequence on the basis ofthe calculated image velocity, calculating a registration error andvarying at least one of the parameters to minimise the registrationerror. The registration error may be calculated from the difference ofthe intensities in the coregistered frames, for example the sum of thesquares of the differences.

Thus in the particular example of the approach proposed by Singh thevalue of parameter k is set for each position (so that the maximumresponse in the search window is close to unity), meaning that k variesfrom position to position over the frame. However, with this aspect ofthe present invention the value of k is fixed over the frame—it does notvary from position to position within the frame. It should be noted thatbecause k is used in a highly non-linear (exponential) function incalculating the response (probability), the velocity and error estimatesare not uniform, because variations in the value of k have a largeeffect. With this aspect of the present invention, on the other hand, kis constant for all pixels in the image, so the processing is uniformacross the image and from frame to frame.

The value of k may be optmised, as mentioned, for example by registeringall frames in the sequence to the first frame, i.e. using the calculatedimage velocity to adjust the image position to cancel the motion—whichif the motion correction were perfect would result in the images in eachframe registering perfectly, and calculating the registration error—e.g.by calculating the sum of square differences of the intensities. Thevalue of k is chosen which gives the minimum registration error.

The calculated similarity may be normalised by dividing it by the numberof pixels in the block, or the number of image samples used in the block(if the image is being sub-sampled).

Thus, again in the particular example above, the value of k in equation(2) above for R_(c) may be replaced by k/(2n+1)². This means that thevalue of k does not need to be changed if the block size is changed. Inparticular, it does not need to be re-optimised, so that once it hasbeen optimised for a given application (e.g. breast ultrasound) usingone frame sequence at one scale and resolution, the same value of k maybe used for the same application on other sequences at other scales andresolutions.

The probability measure may be thresholded such that motions in theimage velocity having a probability less than a certain threshold areignored. The threshold may be optimised by the same process as used foroptimisation of the parameter k above, i.e. by coregistering the framesin the sequence on the basis of the calculated image velocity,calculating a registration error and varying the threshold to minimiseregistration error. The threshold may be positionally independent.

A second aspect of the invention relates to the similarity measure usedin image velocity estimation and provides that the intensities in theblocks W_(c) in the frames being compared are normalised to have thesame mean and standard deviations before the similarity is calculated.The similarity measure may be the CD₂ similarity measure (rather thanthe sum of square differences of Singh), which is particularly suited toultrasound images (see B. Cohen and I. Dinstein, “New maximum likelihoodmotion estimation schemes for noisy ultrasound images”, PatternRecognition 35 (2002), pp 455-463).

A third aspect of the invention modifies the approach of Singh toavoiding multi-modal responses by assuming that the observed movingtissue conserves its statistical behaviour through time (at least forthree to four consecutive frames), rather than assuming a constantvelocity between three frames.

This aspect of the invention provides for block matching across threeframes of the sequence by comparing the intensities in blocks in thefirst and third and the second and third of the three frames, andcalculating the similarity on the basis of the compared intensities.

The blocks in the first and second frames are preferably blockscalculated as corresponding to each other on the basis of a previousimage velocity estimate (i.e. the image velocity estimate emerging fromprocessing preceding frames).

Thus the method may comprise defining for each block in the second framea search window encompassing several blocks in the third frame, andcalculating the similarity of each block in the search window to thesaid block in the second frame and to the corresponding position of thatsaid block in the first frame (as deduced from the previous imagevelocity estimate). Thus this avoids assuming that the velocity remainsthe same through the three frames. It is therefore suited to image framesequences having a relatively low frame rate, where the assumption ofconstant velocity does not tend to hold.

The different aspects of the invention may advantageously be combinedtogether, e.g. in an overall scheme similar to that of Singh. Thus, asin the Singh approach the estimated image velocity using the techniqueabove may be obtained by summing over the search window the values ofeach candidate displacement multiplied by the probability measurecorresponding to that displacement. Further, the estimate may be refinedby modifying it using the estimated image velocity of surroundingpositions—so-called neighbourhood information.

The techniques of the invention are particularly suitable for noisyimage sequences such as medical images, especially ultrasound images.

The invention also provides apparatus for processing images inaccordance with the methods defined above. The invention may be embodiedas a computer program, for example encoded on a storage medium, whichexecutes the method when run on a suitably programmed computer.

The invention will be further described by way of example, withreference to the accompanying drawings in which:

FIG. 1 illustrates schematically a block matching process;

FIG. 2 illustrates schematically a similarity measure calculation usinga constant velocity assumption for three frames;

FIG. 3 illustrates a similarity measure calculation using the assumptionof statistical conservation of moving tissue for three frames;

FIG. 4 is a flow diagram of an optimisation process used in oneembodiment of the invention;

FIG. 5 illustrates the overall process of one embodiment of theinvention; and

FIG. 6 illustrates the optimisation of k and T for a breast ultrasoundimage sequence.

Given a sequence of image frames in which it is desired to calculate theimage velocity, the first aspect of the invention concerns thesimilarity measure used, i.e. the calculation of E_(c)(u, v). While theimage processing algorithm proposed by Singh uses the sum of squaredifferences as a similarity measure, other similarity measures such asCD₂ and normalised crossed correlation (NCC) are known. In thisembodiment a modified version of the CD₂ similarity measure is used.Using the CD₂ similarity measure the most likely value of the velocityis defined as: $\begin{matrix}{{\hat{v}}_{i}^{ML} = {{\max\limits_{v_{i}}{\sum\limits_{j = 1}^{{2n} + 1}x_{ij}}} - y_{ij} - {\ln\left( {\exp\left( {{2\left( {x_{ij} - y_{ij}} \right)} + 1} \right)} \right.}}} & (14)\end{matrix}$where here i refers to the block, j indexes the pixels in the block,there are 2n+1 pixels in the block, and x_(ij) and y_(ij) are theintensities in the two blocks being compared.

This similarity measure is better for ultrasound images than others suchas sum-of-square differences or normalised cross-correlation because ittakes into account the fact that the noise in an ultrasound image ismultiplicative Rayleigh noise, and that displayed ultrasound images arelog-compressed. However it assumes that the noise distribution in bothof the blocks W_(c) is the same and this assumption is not correct forultrasound images. The attenuation of the ultrasound waves introducesinhomogeneities in the image of homogeneous tissue. The time gain andthe lateral gain compensations (compensating respectively for theeffects that deeper tissue appears dimmer and for intensity variationsacross the beam) which are tissue independent and generally constant fora given location during the acquisition, do not compensate fully for theattenuation. Thus in this embodiment of the invention an intensitynormalisation is conducted before calculation of the CD₂ similaritymeasure. This is achieved by making sure that the two blocks W_(c) ofdata have at least the same mean and variance. In more detail, theoriginal intensity values x_(ij) and y_(ij) above are transformed intonew values of {tilde over (x)}_(ij) and {tilde over (y)}_(ij) bysubtracting the mean and dividing by the standard deviation (square rootof the variance) of the intensity values in the block. This gives a newsimilarity measure which can be denoted CD_(2-bis) as follows:$\begin{matrix}{E_{i}^{{CD}_{2\text{-}{bis}}} = {{\sum\limits_{j = 1}^{{2n} + 1}{\overset{\sim}{x}}_{ij}} - {\overset{\sim}{y}}_{ij} - {\ln\left( {{\exp\left( {2\left( {{\overset{\sim}{x}}_{ij} - {\overset{\sim}{y}}_{ij}} \right)} \right)} + 1} \right)}}} & (15)\end{matrix}$

This is the similarity measure used in this embodiment to calculate thevalues of E_(cc) used.

To avoid multi-modal responses, the similarity measure may be calculatedover three consecutive frames. However, rather than making the normalconstant velocity assumption as mentioned above and described inrelation to FIG. 2, which results in the similarity measure being basedon comparing the first frame at time t with the next frame at time t+1and the third frame at t+2, instead the result of calculating thevelocities between the preceding frame at time t−1 and the current frameat time t are used. Given a block in frame x_(i) at time t, which iscompared to blocks in the search window W_(s) in frame y_(i) at the timet+1, the position of that block in the preceding frame at time t−1(denoted o_(i)) has already been calculated and so its position can bedenoted (x-u₀, y-v₀) in the preceding frame where (u₀, v₀) was theresult of the preceding velocity (image velocity) calculation. Thus inthe three frame approach in this embodiment of the invention theintensities of each candidate block in the search window W_(s) arecompared with the intensities of the block at (x, y) in the frame x_(i)at time t, and also with the calculated position (x-u₀, y-v₀) of thatblock in the frame or at time t−1. A value of E is calculated for eachcomparison (of x_(i) and y_(i) and o_(i) and y_(i)) and the values aresummed.

This is illustrated schematically in FIG. 3. The approach is applicablewhatever similarity measure is used to compare the intensities. In thecase of the sum-of-square differences, the new similarity measurebecomes: $\begin{matrix}{{E_{c}\left( {u,v} \right)} = {\sum\limits_{i = {- n}}^{n}{\sum\limits_{j = {- n}}^{n}\left( \begin{matrix}\left\lbrack {{I\left( {{x - u_{o} + i},{y - v_{o} + j},{t - 1}} \right)} -} \right. \\{\left. {I\left( {{x + u + i},{y + v + j},{t + 1}} \right)} \right\rbrack^{2} +} \\\left\lbrack {{I\left( {{x + i},{y + j},t} \right)} - {I\left( {{x + u + i},{y + v + j},{t + 1}} \right)}} \right\rbrack^{2}\end{matrix} \right.}}} & (16)\end{matrix}$where the first term compares intensities in frames o_(i) and y_(i),i.e. at times t−1 and t+1, and the second term compares intensitiesbetween frames x_(i) and y_(i), i.e. at times t and t+1.

In the case of CD_(2-bis) similarity measure defined above, thecalculation of E over three frames becomes: $\begin{matrix}{E_{i}^{{CD}_{2 - {bis}}} = {\left\lbrack {{\sum\limits_{j = 1}^{{2n} + 1}{\overset{\sim}{o}}_{ij}} - {\overset{\sim}{y}}_{ij} - {\ln\left( {{\exp\left( {2\left( {{\overset{\sim}{o}}_{ij} - {\overset{\sim}{y}}_{ij}} \right)} \right)} + 1} \right)}} \right\rbrack + {\quad\left\lbrack {{\sum\limits_{j = 1}^{{2n} + 1}{\overset{\sim}{x}}_{ij}} - {\overset{\sim}{y}}_{ij} - {\ln\left( {{\exp\left( {2\left( {{\overset{\sim}{x}}_{ij} - {\overset{\sim}{y}}_{ij}} \right)} \right)} + 1} \right)}} \right\rbrack}}} & (17)\end{matrix}$or in more detail: $\begin{matrix}{{E_{c}\left( {u,v} \right)} = {\sum\limits_{i = {- n}}^{n}{\sum\limits_{j = {- n}}^{n}\begin{pmatrix}{\begin{pmatrix}{{\overset{\sim}{I}\left( {{x - u_{o} + i},{y - v_{o} + j},{t - 1}} \right)} - {\overset{\sim}{I}\left( {{x + u + i},{y + v + j},{t + 1}} \right)} -} \\{\ln\left( {\exp\left( {2\left\lbrack {{\overset{\sim}{I}\left( {{x - u_{o} + i},{y - v_{o} + j},{t - 1}} \right)} -} \right.} \right.} \right.} \\\left. {\left. \left. {\overset{\sim}{I}\left( {{x + u + i},{y + v + j},{t + 1}} \right)} \right\rbrack \right) + 1} \right)\end{pmatrix} +} \\\begin{pmatrix}{{\overset{\sim}{I}\left( {{x + i},{y + j},t} \right)} - {\overset{\sim}{I}\left( {{x + u + i},{y + v + j},{t + 1}} \right)}} \\{\ln\left( {{\exp\left( {2\left\lbrack {{\overset{\sim}{I}\left( {{x + i},{y + j},t} \right)} - {\overset{\sim}{I}\left( {{x + u + i},{y + v + j},{t + 1}} \right)}} \right\rbrack} \right)} + 1} \right)}\end{pmatrix}\end{pmatrix}}}} & (18)\end{matrix}$Here {overscore (I)} represents the intensity data I transformed asdetailed above (but only, of course, within the interesting block, notfor the whole image).

This avoids the assumption that the velocity is the same over the threeframes. Instead it looks for the best match in frame y_(i) to the blockin x_(i) and the calculated previous position of the block (in o_(i)).It improves the matching process especially at low frame rates, e.g. of20-30 Hz. This makes it particularly useful in the case of contrastechocardiography, abdominal imaging, tissue Doppler and real-time3D-imaging, where low frame rates are typical.

Having established the new similarity measure, the next stage is tocalculate a probability mass function from the similarity measure. Inthe Singh approach this was by equation (2) above. As discussed above,that involved setting a value of k for each position in the frame suchthat the maximum response in the search window was close to unity.However, in this embodiment of the invention the value of k is, instead,set to be the same for all positions in the frame and all frames in thesequence. The value of k is found in an optimisation approach which willbe described below. Given the value of k the probability mass functionfor this embodiment is given by $\begin{matrix}{{{R_{c}\left( {u,v} \right)} = {\frac{1}{Z}\quad{\exp\left( {\frac{k}{\left( {{2n} + 1} \right)^{2}}\left( {{E_{c}\left( {u,v} \right)} - m} \right)} \right)}}},} & (19)\end{matrix}$where m is the maximum of the similarity measure in the search windowW_(s) (i.e. for −N≦u, v≦N) which is deducted from E_(c)(u,v) to avoidnumerical instabilities.

Thus it can be seen that the similarity measure is modified by dividingthe value of k by the size of the block W_(c). This is necessary so thatthe optimised value of k calculated for one image sequence can be usedat all scales and resolutions (i.e. regardless of the size of the blockW_(c) chosen) for that sequence.

The values of the response R_(c) calculated using this equation are thenused to calculate expected values of the velocity (u_(cc), v_(cc)) andthe corresponding covariance matrices using equations (4), (5) and (8)above. However, in this embodiment the calculation of the velocities(u_(cc), v_(cc)) is further modified by using only candidate velocitieswhich have probabilities above a certain threshold α in the velocityestimate of equations (4) and (5) however all candidate velocities areused in the covariance calculation. Thus in this embodiment the velocityestimates are calculated as follows: $\begin{matrix}\begin{matrix}{u_{cc}^{T} = \frac{\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{u\quad{R_{c}^{T}\left( {u,v} \right)}}}}{\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{R_{c}^{T}\left( {u,v} \right)}}}} \\{v_{cc}^{T} = \frac{\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{v\quad{R_{c}^{T}\left( {u,v} \right)}}}}{\sum\limits_{u = {- N}}^{N}{\sum\limits_{v = {- N}}^{N}{R_{c}^{T}\left( {u,v} \right)}}}} \\{{where},} \\{{R_{c}^{T}\left( {u,v} \right)} = \left\{ \begin{matrix}{R_{c}\left( {u,v} \right)} & {{{if}\quad{R_{c}\left( {u,v} \right)}} \geq \alpha} \\0 & {otherwise}\end{matrix} \right.}\end{matrix} & (20)\end{matrix}$The threshold α is defined as follows:α=−T(

−

) with Tε[0,1]where

and

are the maximum and minimum of the probability mass function R_(c) (u,v)respectively.

Thus it can be seen that if T is set to 1, the threshold a becomes theminimum value of R_(c), meaning that all values of the candidatevelocities are used in the calculation, and the calculation becomesequivalent to that in the Singh approach. If T=0, on the other hand, thethreshold a becomes the maximum value of the response so that only thecandidate velocity with maximum probability is taken as the estimatedvelocity. Thus the estimate would be totally biased towards thepredominant mode. In fact the value of T, optimised in the sameoptimisation process as that used for k, to be explained below, inpractice will be between zero and one.

The estimates of velocity and the covariance matrices are used togetherwith neighbourhood information in the iterative process described aboveto calculate the optimised values of velocity in accordance withequation (12) above.

FIG. 4 illustrates schematically how the values of k and T are optimisedtogether in a 2D space. In step 40 the sequence of images is taken andin step 41 the values of k and T are initialised. Then in step 42 theimage velocity is estimated using the initial values of k and T. Theseinitial values may be chosen from experience based on the type ofimaging equipment and the subject of the imaging sequence. The processis relatively robust to the choice of k and T, so, for example, initialvalues of T=0.5 and k=0.5 may be suitable for an ultrasound imagingsequence. Having calculated the image velocity in step 42 it is thenpossible in step 43 to register all of the subsequent frames to thefirst frame. “Registering” frames is equivalent to superimposing theimages one upon the other and adjusting their relative position to getthe best match. In practice the process involves correcting thesubsequent frames for motion using the calculated image velocity. Havingregistered the frames a registration error ξ is calculated using anerror function in step 44. As an example, the error function may be asum of square differences in the intensities of the frames. If the imagevelocity estimation were perfect, there would be no difference inintensities (as the motion correction would be perfect) and thus theerror function would be zero. In practice, of course, the error functionis non-zero and so in step 45 the values of k and T are varied tominimise the error function ξ. This may be achieved using amultidimensional minimisation algorithm such as the Powell algorithm(see William H. Press et al., “Numerical recipes in C: The art ofscientific computing”, Cambridge University Press). The optimisationprocess may be continued until the change in the value of the errorfunction ξ is below a certain threshold. In one experiment to compensatea breast compression sequence for distortion the optimal values werefound to be T=0.660 and k=0.237. FIG. 6 shows the results of theexperiment conducted on the ultrasound breast data. The error shown isthe registration error ξ. Two observations can be made:

-   -   1. For T=0, the velocity estimation is equivalent to taking the        argument of the maximum of the probability. Hence,        theoretically, the parameter k does not have any influence on        the result. This can easily be observed in this experiment, and        it corresponds to the maximum error. In this case, the image        velocity is quantified by the pixel resolution of the image, and        hence the error on the image velocity is of the order of the        pixel resolution. Furthermore, this approach is not robust        against noise. This explains this high error on the velocity        estimation.    -   2. For T=1 (as in Singh), the velocity estimation is equivalent        to taking the mean of the probability. The results are better        than for T=0, but do not correspond to the optimal value. This        result can be explained by the fact that taking the mean of the        probability as an estimates of the velocity is not very precise        and may lead to biased estimation if the pdf is not mono-modal        or non-well-peaked pdfs. Observe as well the expected functional        dependence between the two parameters (T and k). This last point        indicates that the search for the optimal values of T and k        should be done in the 2D space. In this experiment the optimal        values are T=0.660 and k=0.237. Thus showing a clear distinction        from the Singh result.

It should be noted that the improvements above may be used in acoarse-to-fine strategy, i.e. a multiresolution approach in whichvelocities are first estimated at a low resolution, then at a next finerresolution these estimates are used as a first guess in the estimationprocess and the estimates are refined. This means that instead ofsearching in the window around (x, y) in the second frame, one cansearch around (x+u_(est), y+v_(est)) where u_(est) and v_(est) arevelocity estimates propagated from the coarser level. This approach iscomputationally efficient. Further, the image velocity estimation may beconcentrated on regions of the image in motion, rather than conductedover the whole image.

FIG. 5 illustrates the process flow for the image velocity estimationgiven values of k and T (e.g. initial values if this is the firstestimate for a given application, or optimised values). Firstly, in step50, a sequence of image frames is taken. Then, in step 51, thesimilarity measure across three frame sets of the sequence is calculatedusing the CD_(2-bis) similarity measure, i.e. using equation (18) at thedesired scale and resolution. “Resolution” means whether one is samplingevery pixel, or only certain pixels in the block W_(c) and “scale”refers to how far the block is displaced in the search window W_(s),e.g. by one pixel, or by several pixels. Having calculated thesimilarity values, the value of the response R_(c) can be calculated instep 52 using equation (19). Then in step 53 the value of U_(cc) iscalculated using equation (20) and the corresponding covariance matrixS_(cc) using equation (8). In step 54 the value of {overscore (U)} andthe covariance for the neighbourhood estimate is calculated usingequations (6), (7) and (9). Then in step 55 the conservation andneighbourhood information are fused using the iterative process ofequation (12) to give an optimised velocity estimate U_(op).

As indicated by step 56, the process may be repeated at finer scales andresolutions, with the computational burden being eased by making use ofthe image velocity estimate already obtained.

The above improvements in the block matching technique are particularlysuccessful in allowing tracking of cardiac boundary pixels inechocardiographic sequences. The block matching steps may beconcentrated in a ribbon (band) around a contour defining the cardiacborder to reduce the computational burden. However, the technique isapplicable to other non-cardiac applications of ultrasound imaging.

1. A method of processing a sequence of image frames to estimate imagevelocity through the sequence comprising: block matching using asimilarity measure by comparing the intensities in image blocks in twoframes of the sequence and calculating the similarity between the saidblocks on the basis of their intensities, calculating from thesimilarity a probability measure that the two compared blocks are thesame, and estimating the image velocity based on the probabilitymeasure, wherein the probability measure is calculated using aparametric function of the similarity which is independent of positionin the image frames.
 2. A method according to claim 1 wherein theparameters of the parametric function are independent of position in theimage frames.
 3. A method according to claim 2 wherein at least one ofthe parameters is optimised by coregistering the frames in the sequenceon the basis of the calculated image velocity, calculating aregistration error and varying at least one of the parameters tominimise the registration error.
 4. A method according to claim 3wherein the registration error is calculated from the differences of theintensities in the coregistered frames.
 5. A method according to claim 4wherein the registration error is calculated from the sum of the squaresof the differences of the intensities in the coregistered frames.
 6. Amethod according to claim 1 further comprising the step of normalisingthe calculated similarity with respect to the size of the block andcalculating the probability measure on the basis of the normalisedsimilarity.
 7. A method according to claim 6 wherein the calculatedsimilarity is normalised by dividing it by the number of image samplesin the block.
 8. A method according to claim 6 wherein the calculatedsimilarity is normalised by dividing it by the number of pixels in theblock.
 9. A method according to claim 1 wherein the probability measureis a monotonic function of the similarity.
 10. A method according toclaim 1 wherein the probability measure is thresholded such that motionsin the image velocity whose probabilities have a predefined relationshipwith a threshold are ignored.
 11. A method according to claim 10 whereinthe threshold is optimised by coregistering the frames in the sequenceon the basis of the calculated image velocity, calculating aregistration error and varying the threshold to minimise theregistration error.
 12. A method according to claim 10 wherein thethreshold is positionally independent.
 13. A method according to claim10, wherein the threshold and parameters are optimised together.
 14. Amethod according to claim 1 further comprising normalising theintensities in the two blocks to have the same mean and standarddeviation before calculating said similarity.
 15. A method according toclaim 1 wherein the similarity measure is the CD_(2-bis) similaritymeasure.
 16. A method according to claim 1 wherein the block matching isconducted across three frames of the sequence by comparing theintensities in blocks in the first and third and the second and third ofthe three frames and calculating the similarity from said comparedintensities.
 17. A method according to claim 16 wherein the blocks inthe first and second frames are blocks calculated as corresponding toeach other on the basis of a previous image velocity estimate.
 18. Amethod of processing a sequence of image frames to estimate imagevelocity through the sequence comprising: block matching using asimilarity measure by comparing the intensities in image blocks in threeframes of the sequence by comparing the intensities in blocks in thefirst and third and the second and third of the three frames, andcalculating the similarity between the said blocks on the basis of theirintensities.
 19. A method according to claim 18 wherein the blocks inthe first and second frames are blocks calculated as corresponding toeach other on the basis of a previous image velocity estimate.
 20. Amethod according to claim 19 comprising defining for each block in thesecond frame a search window encompassing several blocks in the thirdframe, and calculating the similarity of each block in the search windowto the said block in the second frame and to the corresponding positionof the said block in the first frame based on the previous imagevelocity estimate.
 21. A method of processing a sequence of image framesto estimate image velocity through the sequence comprising: blockmatching using a similarity measure by comparing the intensities inimage blocks in two frames of the sequence and calculating thesimilarity between the said blocks on the basis of their intensities,further comprising normalising the intensities in the two blocks to havethe same mean and standard deviation before calculating said similarity.22. A method according to claim 21 wherein the similarity measure is theCD_(2-bis) similarity measure.
 23. A method according to claim 21wherein the block matching is conducted across three frames of thesequence by comparing the intensities in blocks in the first and thirdand the second and third of the three frames and calculating thesimilarity from said compared intensities.
 24. A method according toclaim 23 wherein the blocks in the first and second frames are blockscalculated as corresponding to each other on the basis of a previousimage velocity estimate.
 25. A method according to claim 1 wherein theimage velocity estimate is refined by modifying the image velocityestimate at each position in the image with the estimated image velocityat surrounding positions.
 26. A method according to claim 1 wherein theimages are medical images.
 27. A method according to claim 1 wherein theimages are ultrasound images.
 28. Image processing apparatus comprisingan image velocity estimator adapted to estimate image velocity inaccordance with the method of claim
 1. 29. A computer program comprisingprogram code means for executing on a programmed computer the method ofclaim
 1. 30. A computer-readable storage medium storing a computerprogram according to claim 29.